Security hardening: 30 exploitable vulnerabilities (batches 1–7)#38
Conversation
…ile-size limits, serve CSRF, untrusted wrappers)
…batch_patch, transcribe, tree width)
…nscribe output, session_search get)
…ession_search wrapping, resource size, subagent summary, patch expansion)
…le/element wrapping + cap, config size cap, resource resolver symlink Lstat)
Deploying with
|
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs |
odek | 47fae17 | Commit Preview URL Branch Preview URL |
Jun 13 2026, 12:41 PM |
vprotocol finding (axis 2.1/2.9): the maxHeadTailTotalBytes comment claimed a "total across all files" guarantee that the code enforces per-file. Correct the comment to state the cap is per-file and that the ~10 MiB aggregate bound comes from the separate 10-file-per-call limit. Add TestHeadTail_CapsOutputSizeMultiFile to lock both the per-file cap and the aggregate bound across the max 10 files. Additive only — no behavior change to security logic. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🔏 Verification Certificate — AI Verification Protocol v5.2.7Ran the AI Verification Protocol against this PR's diff (8 commits, ~1,790 insertions across 30 files). Classification: KnownGroundTruth — human-authored security fixes, each with a TDD test for the exact change ( Axes Summary
Signals (qualitative)
Findings & Remediations (auto-applied, additive — §7.1)
Repair Gate (§7.5): ✅ build passes · ✅ full Verdict: AutoApprove (single-reviewer caveat →
|
CodeQL flagged uncontrolled user data (the @-reference query) reaching a
path expression in FileResolver.Search. A query like "../../etc/passwd"
makes filepath.Join(root, query) clean to a path outside root, so
filepath.Glob matched files the workspace must not expose and os.Lstat
leaked their metadata (size, etc.) — the same metadata-disclosure class
this PR hardens elsewhere.
Add a separator-aware withinRoot() confinement primitive and apply it to
every Search match before touching the filesystem. Reuse it in Load,
replacing the prior HasPrefix(absTarget, absRoot) check that also fixed a
boundary bug ("/foo" matching "/foobar").
TestFileResolver_SearchOutsideRoot fails against the pre-fix code
(leaks @../secret.txt) and passes with the guard.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(sec): scope audit ingest recorder to run context (#20) Replace the package-global ingestRecorder callback with a context.Context value carried through the agent loop. This removes the race where concurrent WebSocket sessions could overwrite each other's audit attribution. Key changes: - internal/loop: add WithIngestRecorder / IngestRecorderFrom helpers. - cmd/odek/untrusted.go: wrapUntrusted now reads the recorder from ctx; remove setIngestRecorder globals. - cmd/odek tools: embed ctxTool and pass toolCtx() to wrapUntrusted. - cmd/odek serve/main: inject per-session/per-prompt recorder into ctx. - odek.go: toolAdapter forwards SetContext to odek.Tool implementations. - Tests: add unit, integration, and concurrency regression tests. Fixes finding #20 from sec_findings.md. * fix(sec): scope prompt cancel by session ID (#19) Replace the global currentPromptCancel atomic.Value with a mutex-protected map keyed by session ID. POST /api/cancel now requires a session_id query parameter, so one WebSocket connection cannot cancel another connection's running prompt. Changes: - cmd/odek/serve.go: add promptCancels map + register/unregister/cancel helpers; handlePrompt registers/unregisters the cancel func for the session it runs; handleCancel requires session_id. - cmd/odek/ui/app.js: include sessionId in cancel fetch URL. - cmd/odek/serve_test.go: update existing cancel tests to pass session_id; add TestServe_Cancel_MissingSessionIDReturns400 and TestServe_Cancel_CannotCrossSessions regression test. Fixes finding #19 from sec_findings.md. * fix(sec): redact Session.Task before persisting (#21) Store.Save already redacted Message.Content and ReasoningContent, but the session Task field (the first user prompt / session title) was written to disk verbatim. Secrets in the first prompt leaked into the session file and index, and were returned by the session APIs. Changes: - internal/session/session.go: redact sess.Task in saveLocked before marshalling and updating the index. - internal/session/session_test.go: add TestStore_SaveRedactsTask covering in-memory, on-disk, and index redaction of the Task field. Fixes finding #21 from sec_findings.md. * fix(sec): confine glob/search_files patterns to workspace (#22) Replace filepath.Glob in glob and search_files(target=files) with a workspace-confined walk helper. filepath.Glob follows symlinked directory components and resolves .. segments, allowing patterns like link/*.txt or ../.ssh/id_* to enumerate files outside the working directory. Changes: - cmd/odek/file_tool.go: add confinedGlob helper using filepath.WalkDir, skipping all symlinks, rejecting .. and absolute patterns, and verifying every match stays inside the resolved root. Refactor globTool.Call and searchFilesTool.searchFiles to use it. - cmd/odek/security_vulnerabilities_test.go: add TestGlob_DotDotPatternRejected, TestSearchFiles_DotDotPatternRejected, and TestGlob_AbsolutePatternRejected. Fixes finding #22 from sec_findings.md. * fix(sec): nonce sub-agent untrusted-input fence (#24) buildSubagentRequest previously wrapped untrusted parent input in constant <untrusted_input> tags with no per-call nonce and no neutralisation of the marker string inside the body. A crafted </untrusted_input> in the goal or context could close the fence early and inject instructions. Changes: - cmd/odek/subagent.go: add wrapUntrustedSubagentInput and neutraliseSubagentInputLiterals helpers. Generate a random nonce per request, emit <untrusted_input_<nonce>> tags, and replace literal occurrences of untrusted_input with a look-alike so injected close tags cannot pair with the wrapper. - cmd/odek/subagent_prompt_isolation_test.go: update TestSubagentRequest_UntrustedIsFenced for nonce'd tags; add TestSubagentRequest_UntrustedNeutralisesCloseTag regression test. Fixes finding #24 from sec_findings.md. * fix(sec): reject symlinks in sandbox --ctx injection (#25) InjectFiles used os.Stat, which follows symlinks, and only verified that the symlink path itself was under cwd. A symlink committed inside the project pointing at an arbitrary host file (e.g. leak -> /etc/shadow) would have its target copied into the container. Changes: - internal/sandbox/sandbox.go: use os.Lstat in InjectFiles; reject any ctx file that is a symlink. Resolve cwd to an absolute path and use isPathUnder for the relative-path check. - internal/sandbox/sandbox_test.go: add TestInjectFiles_SkipsSymlink. Fixes finding #25 from sec_findings.md. * fix(sec): allowlist sandbox network modes (#26) BuildRunArgs only rewrote 'host' to 'none', but other Docker network modes such as 'container:<name>' were passed through unchanged. That allowed the sandbox to share another container's network namespace and bypass intended network isolation. Changes: - internal/sandbox/sandbox.go: enforce an allowlist of 'none' and 'bridge' for --sandbox-network; reject any other value (including 'container:...') by forcing it to 'none' with a warning. Empty Network defaults to 'bridge'. - internal/sandbox/sandbox_test.go: add tests for container:, bridge, empty, and host network modes. Fixes finding #26 from sec_findings.md. * fix(sec): FD-based API key handoff for Telegram restart child (#28) spawnChild copied os.Environ() and re-injected ODEK_API_KEY, DEEPSEEK_API_KEY, and OPENAI_API_KEY, leaving the key visible in the child process environment (/proc/<pid>/environ) and crash dumps. Changes: - cmd/odek/telegram.go: reuse the existing FD-based key handoff helpers (writeKeyToUnlinkedFile / readKeyFromInheritedFD). telegramCmd reads the key from if present; spawnChild writes the resolved key to an unlinked tempfile and passes the FD to the child via the safe ODEK_API_KEY_FD signal env var instead of the key itself. - cmd/odek/telegram_test.go: replace env-injection test with a ProcAttr interception test that verifies the key is passed via FD 3 and absent from the child env. Fixes finding #28 from sec_findings.md. * fix(sec): harden Telegram document filename sanitization (#30) sanitizeDocName only stripped directory components and rejected empty/. /.. names. It allowed hidden files, arbitrary characters, and very long filenames, so an attacker could send a document named '.bashrc' or an overlong filename that would be saved under ~/.odek/media/. Changes: - internal/telegram/download.go: reject names starting with '.', restrict characters to [A-Za-z0-9._-] (replacing others with '_'), enforce a maxDocNameLen of 200 while preserving the extension, and factor out fallbackDocName. - internal/telegram/download_test.go: add test cases for hidden files, unsafe characters, Unicode names, and overlong names. Fixes finding #30 from sec_findings.md. * Fix #31: decline auto-save of tainted skills; require --force to promote - AutoSaveSuggestions now skips suggestions with untrusted provenance unless allowUntrusted is true. - RunAutoSaveLoop declines tainted skills by default and logs them. - promoteSkill refuses to clear NeedsReview on tainted skills without --force. - CLI help, SECURITY.md, LEARNING.md, README.md, and AGENTS.md updated. - Added/updated tests for promote refusal/force and auto-save decline/allow paths. * Fix #34: escalate interpreters that can invoke shell commands to code_execution - Add embeddedShellInterpreters map for awk/gawk/mawk/nawk, ed/ex, vi/vim/nvim/view, emacs/emacsclient. - Classify their non-version/help invocations as code_execution. - Detect sed 'e' command, -f/--file, and s///e flag as code_execution. - Remove awk from writePrefixes; add embeddedShellInterpreters to isKnownCommandName. - Add regression tests for awk, sed, vim, find -exec bypasses. - Update SECURITY.md and AGENTS.md. * Fix #37: cap MCP server stdout line size to prevent OOM - Replace unbounded bufio.Reader.ReadString with bufio.Scanner limited to 10 MiB per line. - On oversized line, report error to pending callers and close the connection. - Add TestReadLoop_OversizedResponse regression test. * security: harden MCP tools/list metadata trust (#38) - Validate MCP tool names in internal/mcpclient/client.go Discover: ASCII letters/digits/_/-, non-empty, <=64 chars. - Reject raw MCP tool names that shadow odek built-ins in cmd/odek/main.go loadMCPTools, even though prefixed names are normally used. - Add per-tool approval for project-level MCP servers in cmd/odek/mcp_approval.go, persisted in ~/.odek/mcp_tool_approvals.json. - Reuse ODEK_APPROVE_MCP env var for both server and tool approvals. - Add unit tests and E2E coverage for name validation, shadowing, and approval gating. - Update docs/SECURITY.md, AGENTS.md, and docs/MCP.md. * security: cap file sizes in read-only perf tools and inline inputs (#39, #40) - Add 10 MiB size checks to count_lines, checksum, head_tail, and word_count before scanning/hashing, matching other perf tools. - Add maxInlineContentBytes (10 MiB) and enforce it on base64 inline string/content and tr inline content arguments. - Add tests for each new size-cap rejection path. - Update docs/SECURITY.md and AGENTS.md. * test+docs: increase coverage of size-cap changes and keep docs consistent - Add exact-size boundary tests for count_lines, checksum, head_tail, word_count, base64 inline content, and tr inline content. - Add tail-mode and decode-string oversized rejection tests for head_tail and base64 to cover both branches. - Add CHEATSHEET note listing the perf tools with 10 MiB file/inline caps. * security: fix schedule locking/JSON caps and nonce tool-result delimiters (#41-#43) - internal/schedule/store.go: fileLock now returns an error instead of a silent no-op fallback; all mutating callers (Add, Put, Remove, SetEnabled, SaveState) abort when the flock cannot be acquired. - internal/schedule/store.go: readJSON now Stat()s schedule/state files and rejects anything larger than 10 MiB before reading. - internal/loop/loop.go: tool-result delimiter now embeds a per-call random hex nonce in both the opening and closing lines, preventing a tool or MCP server from forging the closing delimiter. - Add/update tests for lock failure, oversized schedule file, exact-size boundary, and nonce uniqueness. - Update docs/SECURITY.md and AGENTS.md. * security: fix parallel_shell/batch_patch/browser/telegram/restart findings (#44-#48) - parallel_shell: bind commands to agent context via CommandContext, run each in its own process group, kill the group on cancellation/timeout, and cap per-command timeouts at 30 minutes. - batch_patch: add trustedClasses field and pass it to CheckOperation for consistent approval behavior with write_file/patch. - browser: wrap clickableRef.URL as untrusted; keep rawURL for internal click resolution. - telegram: enforce MaxMsgLength in UTF-16 code units instead of bytes. - telegram: write restart.json with 0600 instead of 0644. - Add regression tests for all five fixes. - Update docs/SECURITY.md and AGENTS.md. * test+docs: increase coverage of #44-#48 changes and keep CHEATSHEET current - Add TestParallelShell_ContextCancel_Explicit to cover the context.Canceled branch in runOne. - Add TestBrowser_Click_ButtonAcknowledges and TestBrowser_Click_InputSubmitAcknowledges to cover button/submit click paths and increase browser_tool.go coverage. - Update docs/CHEATSHEET.md: parallel_shell process-group kill, browser URL wrapping, Telegram flock/0600 restart marker/UTF-16 length limits. * security: fixes #50-#53 (Telegram MarkdownV2 escape, resource limit cap, WS connection limits, sub-agent progress bounds) - #50: Escape agent-generated text in send_message before Telegram MarkdownV2 send - #51: Cap /api/resources limit to 100 in handler and Registry.Search - #52: Enforce max 20 concurrent WebSocket connections + per-IP upgrade rate limit - #53: Cap sub-agent progress stream at 100k lines / 100 MiB and cancel child on overflow Tests and docs (SECURITY.md, CHEATSHEET.md) updated. * security: harden #54-#59 (subagent cleanup, fsatomic, resource search, schedule perms, config TOCTOU, flock docs) - Scope sub-agent --task file deletion to odek temp files only - Create fsatomic temp files with exact perms via O_EXCL + random suffix - Sanitize resource search query: 256-byte cap + glob metachar escaping - Create schedule directory with 0700 permissions - Read config via single Open+LimitReader to remove size-check TOCTOU - Document advisory flock semantics in package doc - Update SECURITY.md with defenses 43-48 and attack-vector rows
Summary
This PR closes 30 exploitable vulnerabilities discovered in the odek agent runtime through a TDD-driven hardening pass. Each fix is paired with a RED regression test; the branch is kept green with the full suite plus the race detector.
Scope
The hardening spans file tools, browser/network tools, performance/parallel tools, the Web UI server, sub-agent handling, config loading, and the @-resource resolver. Every change is minimal and follows the existing code style.
Batches
odek serveCSRF / clickjacking / security headers, untrusted-content wrappers for head_tail / diff / sort / tr / json_query--ctxprompt wrapping, resource file-size cap, delegate_tasks summary cap, patch / batch_patch output-expansion capKey security themes
read_file,search_files,multi_grep,batch_read,head_tail,diff,sort,tr,json_query,glob,file_info,tree,base64file mode),shell,browser,transcribe,session_search, @-resources,--ctxfiles, and MCP tools now wrap external output in a per-call nonce wrapper.search_filesandFileResolver.SearchuseLstatto avoid following symlinks for metadata.browserandhttp_batchre-classify redirect hops;browserenforces a request timeout and caps extracted elements.Testing
Both pass.
Documentation
AGENTS.mdupdated with the new defenses and constants.docs/SECURITY.mdupdated with the untrusted-content wrapper table, config-size-cap section, and resource-resolver symlink note.